Complex tensor factorization in modulation frequency domain for single-channel speech enhancement

Authors

  • Shogo Masaya
  • Masashi Unoki
Abstract

This paper proposes a novel speech enhancement method based on tensor factorization in the modulation frequency domain, extending complex non-negative matrix factorization (CMF). Non-negative matrix factorization (NMF) has recently attracted a great deal of attention as an approach to speech enhancement because it makes feature detection in the acoustic frequency domain easy. However, previous studies have suggested that spectral processing in the modulation frequency domain, such as spectral subtraction, is an effective scheme for speech enhancement. To exploit more of the information in speech, processing in the modulation frequency domain should use not only the amplitude information but also the phase information. We therefore present a new tensor factorization of the complex spectrum in the modulation frequency domain for single-channel speech enhancement. The amplitude and phase spectra in the acoustic frequency domain can be estimated from the factorized complex spectra in the modulation frequency domain. Numerical experiments were carried out under several noisy conditions to evaluate the effectiveness of the proposed method, with the signal-to-error ratio and signal-to-noise ratio loss used as objective measures. The results showed that the proposed method outperformed existing speech enhancement methods based on NMF and CMF.
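The abstract does not give the construction of the complex modulation-domain representation, so the following is only a minimal sketch of one plausible reading: a short-time Fourier transform yields the complex spectrum in the acoustic frequency domain, and a second Fourier transform along time within each acoustic frequency band yields a three-way complex tensor (acoustic frequency, modulation frequency, segment). The function names, frame sizes, and the use of non-overlapping rectangular segments are all assumptions made for illustration, not the authors' settings, and no factorization step is shown.

```python
import numpy as np


def stft(x, frame_len=256, hop=128):
    """Acoustic-domain STFT. Returns a complex array (freq_bins, frames).

    Frame length, hop size, and the Hann window are illustrative choices.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)


def modulation_tensor(S, seg_len=16):
    """Second transform along time (assumed construction).

    Splits the complex spectrogram S into non-overlapping segments of
    seg_len frames and applies an FFT along time within each segment.
    Returns a complex tensor (freq_bins, mod_bins, segments).
    Trailing frames that do not fill a segment are dropped.
    """
    n_seg = S.shape[1] // seg_len
    segs = np.stack(
        [S[:, j * seg_len:(j + 1) * seg_len] for j in range(n_seg)],
        axis=2,
    )
    return np.fft.fft(segs, axis=1)


def inverse_modulation_tensor(T):
    """Map a modulation-domain tensor back to a complex spectrogram."""
    segs = np.fft.ifft(T, axis=1)  # (freq_bins, seg_len, segments)
    return np.concatenate(
        [segs[:, :, j] for j in range(segs.shape[2])], axis=1
    )


if __name__ == "__main__":
    # Random noise stands in for a speech signal here.
    x = np.random.default_rng(0).standard_normal(8000)
    S = stft(x)
    T = modulation_tensor(S)
    S_rec = inverse_modulation_tensor(T)
    # Amplitude and phase in the acoustic frequency domain
    amplitude, phase = np.abs(S_rec), np.angle(S_rec)
    print(T.shape, amplitude.shape, phase.shape)
```

Because the tensor keeps complex values rather than magnitudes, inverting the second transform returns the complex acoustic spectrum, from which both amplitude and phase can be read off. This is the property the abstract relies on when it says the acoustic-domain amplitude and phase can be estimated from the factorized complex modulation spectra.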

Related articles

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal is significantly reduced in the presence of environmental noise, which degrades the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Multichannel speech dereverberation based on convolutive nonnegative tensor factorization for ASR applications

Room reverberation is a primary cause of failure in distant speech recognition (DSR) systems. In this study, we present a multichannel spectrum enhancement method for reverberant speech recognition, which is an extension of a single-channel dereverberation algorithm based on convolutive nonnegative matrix factorization (NMF). The generalization to a multichannel scenario is shown to be a specia...

Single-Channel Speech Enhancement Using Double Spectrum

Single-channel speech enhancement is often formulated in the Short-Time Fourier Transform (STFT) domain. As an alternative, several previous studies have reported advantages of speech processing using pitch-synchronous analysis and filtering in the modulation transform domain. We propose to use the Double Spectrum (DS) obtained by combining pitch-synchronous transform followed by modulation tran...

Nonnegative Tensor Factorization with Frequency Modulation Cues for Blind Audio Source Separation

We present Vibrato Nonnegative Tensor Factorization, an algorithm for single-channel unsupervised audio source separation with an application to separating instrumental or vocal sources with nonstationary pitch from music recordings. Our approach extends Nonnegative Matrix Factorization for audio modeling by including local estimates of frequency modulation as cues in the separation. This permi...


Publication date: 2015